ABSTRACT



There are advantages to using a priori or planned 
comparisons rather than omnibus multivariate analysis 
of variance (MANOVA) tests followed by post hoc or a 
posteriori testing. A small heuristic data set is used 
to illustrate these advantages. An omnibus MANOVA test
was performed on the data, followed by a post hoc test
(discriminant analysis). A second MANOVA using Helmert
coding as a means for a planned comparison test was run,
and the results were compared to those from the MANOVA
followed by the unplanned tests. The results provide
concrete examples of the different possible outcomes
using various procedures. Two advantages of using
planned or a priori contrasts over post hoc tests are
presented. In addition to increasing power against
Type II error, planned comparisons force the
researcher to be more thoughtful in the design of the
research.



PLANNED COMPARISONS 1 
Researchers use a variety of analytical methods to 
evaluate the differences between means. However, there 
is considerable controversy surrounding the various 
methods. Bray and Maxwell (1985, p. 8) describe the 
controversy in two areas: 

1) Issues concerning the overall test, such 
as choices between test statistics, power and 
sample size concerns, and measures of effect 
size, and 2) methods for further analyzing 
and interpreting group differences. 
The most commonly used analytical techniques to find
differences in means are analysis of variance methods:
analysis of variance (ANOVA), multivariate analysis of
variance (MANOVA), analysis of covariance (ANCOVA), and
multivariate analysis of covariance (MANCOVA),
hereafter referred to as "OVA methods" (Thompson,
1985).

This paper will focus on the use of MANOVA, but 
the reader is encouraged to read Stevens (1990) for an 
explanation of the other OVA methods. The forerunner 
to MANOVA, univariate analysis of variance (ANOVA),
developed in the 1920s by Fisher, has been used
extensively in social science and experimental research
since its inception. MANOVA, on the other hand, had
limited use when it was first conceptualized because of
perceived computational complexity. Within the last
two to three decades, however, MANOVA has been deemed
more usable in experimental research, partly due to the
accessibility of computer programs for MANOVA
(Swaminathan, 1989).

Throughout the years, the popularity of all OVA 
methods has evidenced itself in research studies, 
although investigations of research trends indicate use 
of these methods has declined more recently. Edgington
(1974) reported that, in APA journals from 1948 to 1962,
71 percent of the articles that used
inferential statistics utilized ANOVA procedures.
Likewise, Willson (1980) found that from 1969 to 1978, OVA
methods were used in 56 percent of the articles
published in the American Educational Research Journal
(AERJ). In 1985 Goodwin and Goodwin reported that 37
percent of the AERJ articles from 1979 to 1983 used OVA
techniques, while Elmore and Woehlke (1988) showed that
ANOVA and ANCOVA methods made up 25 percent of the
techniques used in articles published between 1978 and
1987. Daniel (1989, p. 1) concluded, "Thus there is
some trend away from the use of OVA methods in 
educational research, although these methods are still 
used with considerable frequency." 

Many researchers have criticized the use of OVA 
methods in recent years. For example, Cohen (1968, p. 
441) describes the use of OVA methods as sometimes 
necessitating the "squandering [of] much information." 
Thompson (1986) explains that most OVA methods require 
all independent variables be nominally scaled, even 
though most are scaled higher. For example, an 
intervally scaled independent variable might be reduced 
to "low, medium, high," thus eliminating data and 
misrepresenting the reality underlying the data already 
collected (Daniel, 1989). Cohen (1968, p. 441)
discusses the reduction of power against Type II errors 
resulting from decreased reliability levels of 
variables that were originally higher than nominally 
scaled. 

Secondly, some researchers who have used OVA 
methods have thrown out cases to achieve 
proportionality or equal numbers of subjects in each 
cell. This type of "balanced" design provided less 
complicated computations at a time when computers were 
not readily available. The situation is distorted when
researchers eliminate subjects in order to achieve
"balanced" cells (Cohen, 1968).

The debate over the proper use of analytical 
methods does not end once initial statistically 
significant differences among means are found. When 
using traditional applications of OVA methods, the 
researcher tests the omnibus OVA effects and, if a 
statistically significant difference is found, then the 
researcher needs to perform a follow-up test to locate 
where the statistically significant differences occur. 
Huck, Cormier, and Bounds (1974, pp. 87-88) explain: 
If the F corresponding to a main effect is
significant, a researcher will know that
there are significant differences among the
overall means for the levels making up the
factor. However, he will not know which
specific levels are significantly different
from one another. To answer this question, a
researcher will need to apply a follow-up
test.

However, there is controversy surrounding the choice of 
follow-up tests. Many "unplanned" or "post-hoc" tests 
(also called "a posteriori" tests) are available,
including Scheffe, Tukey, or Duncan tests (Thompson,
1988). Following a MANOVA, discriminant analysis
(which will be the focus of this paper) may be used in
order to find out "which characteristics are most
powerful discriminators" (Klecka, 1980, p. 9).

Post hoc comparisons usually entail performing 
numerous analyses involving all possible comparisons of 
means, even though this practice may not always be 
appropriate. Thompson (1990, p. 11) states: 

Some researchers always test even omnibus 
effects that are not of interest because they 
naively believe that such analyses always 
increase the probability of detecting 
statistically significant effects on the 
omnibus hypotheses that are of interest. 
When a researcher performs comparisons of all means, 
several hypotheses are tested. When several hypotheses 
are tested within one study, there is an inflation of 
experimentwise Type I error rate, i.e., the possibility 
of making a Type I error somewhere in the study. Fish 
(1988) gives the calculations of "experimentwise" error 
rates for studies involving one sample of research 
participants while varying the "testwise" alpha levels 
and the numbers of perfectly uncorrelated dependent 
variables. If a researcher conducts five tests, all at 
the .05 level (5% chance of making a Type I error), the 
probability of making a Type I error somewhere 
(anywhere) in the study does not remain at the .05 
level. As shown by Love (1988), the experimentwise
alpha rate actually may escalate as high as 0.226219
[1 - (1 - .05)^5], indicating there is approximately a 23%
chance of making a Type I error somewhere in the study.
Hence, the experimentwise error rate may not equal the
nominal testwise alpha level used with each separate
hypothesis tested (Thompson, 1990). To avoid inflating
the experimentwise Type I error rate, certain statistical
adjustments are incorporated in post hoc procedures.
For example, Bonferroni corrections revise "testwise"
alpha levels by dividing the nominal level by the
number of tests (Fish, 1988). However, these
adjustments decrease power against Type II errors, a
trade-off that causes many researchers to use
alternative analytical methods for comparing group
means (DuRapau, 1988).
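
The experimentwise arithmetic and the Bonferroni adjustment described
above can be checked with a short sketch. The function names are
illustrative only and do not come from the sources cited; the
calculation assumes perfectly uncorrelated tests, as in the five-test
example.

```python
def experimentwise_alpha(testwise_alpha: float, n_tests: int) -> float:
    """Experimentwise Type I error rate for n_tests uncorrelated
    tests, each conducted at testwise_alpha: 1 - (1 - alpha)^k."""
    return 1.0 - (1.0 - testwise_alpha) ** n_tests


def bonferroni_alpha(nominal_alpha: float, n_tests: int) -> float:
    """Bonferroni-corrected testwise alpha: the nominal level
    divided by the number of tests."""
    return nominal_alpha / n_tests


# Five tests, each at the .05 level, as in the example in the text.
print(round(experimentwise_alpha(0.05, 5), 6))       # 0.226219

# After a Bonferroni correction each test is run at .05 / 5 = .01,
# which holds the experimentwise rate just under the nominal .05.
corrected = bonferroni_alpha(0.05, 5)
print(round(experimentwise_alpha(corrected, 5), 5))  # 0.04901
```

The second figure shows the trade-off noted above: the experimentwise
rate is controlled only by making each individual test more
conservative.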

"Planned" (also called "a priori" or "focused") 
comparisons have been recommended as a valuable
alternative to post hoc tests. Planned comparisons can
be orthogonal (which is the focus of this paper) or
nonorthogonal. With orthogonal planned comparisons,
decisions regarding the null hypothesis of one contrast
are not influenced by the decisions regarding any other
comparison (DuRapau, 1988).

One of the important advantages of planned 
comparisons is the protection they offer against Type 
II error. Kerlinger and Pedhazur (1973, p. 131) 
explain: 

The tests of significance for a priori, or
planned, comparisons are more powerful than
those for post hoc comparisons. In other
words, it is possible for a specific
comparison to be not significant when tested
by post hoc methods but significant when
tested by a priori methods.
Planned comparisons have a second and more 
important advantage. The planning involved in an a 
priori procedure forces the researcher to be more 
thoughtful in the design of the research (Thompson, 
1988), basing analytic methods on the researcher's 
sense of the relationships among the variables most
worthy of study. A researcher must carefully
scrutinize, ahead of time, the combinations of 
comparisons that will most likely contribute to and 
enhance the research effort. 

PURPOSE 

The purpose of the present paper is to demonstrate 
with concrete examples (a) the problems surrounding the 
use of post hoc procedures in MANOVA, and (b) the 
advantages of using planned comparisons. A small data
set is used to illustrate the discussion. 



METHODS 

Subjects 

Third grade teachers from 16 different schools in 
one Louisiana school district were involved in a 
science grant. The three-year grant involves the
collaborative efforts of the University of New Orleans
and the Louisiana Nature and Science Center to assist
teachers in teaching science to students who learn
differently. For the purposes of this paper, two
portions of data, teachers' responses to a modified
version of a Concerns Questionnaire (Hall, George, &
Rutherford, 1977) and their scores on the life science
portion of the National Assessment of Educational
Progress (NAEP), were selected from an array of data.
The modified Concerns Questionnaire measured the level 
of concern associated with the innovation. Twenty-four 
teachers were randomly assigned to three different 
groups, with each group containing eight teachers. 
Groups 1 and 2 were experimental groups and Group 3 
acted as a control. Teachers in Group 1 received 
science training as well as training on a computer 
retrieval bulletin board system. Group 2 teachers 
received only the training on the bulletin board 
system. Group 3 teachers received no training. 

Table 1 presents the data collected for each 
group. For heuristic value, the NAEP data for Group 2 
were manipulated. Hence, no substantive interpretation 
of the results is intended, although the analyses do 
indicate the methodological issues mentioned 
previously. Case 122 actually received a score of 32 
which was changed to 25. Likewise, Case 134's NAEP
score was changed from 49 to 40. 






Insert Table 1 
about here 



Procedures 

At the beginning and end of the year the teachers 
were asked to complete the modified Concerns 
Questionnaire (see Appendix A for a copy of the 
questionnaire) and to answer the life science questions 
on the NAEP for ages 9, 13, and 17. Only the post-year
scores were included in the present study. To 
illustrate the potential problems of using post hoc 
procedures, a traditional MANOVA was run, testing the 
overall omnibus hypothesis of no difference between the 
means of the three groups. To further describe and 
explain differences, a discriminant analysis was then 
run as a post hoc procedure. 

The MANOVA was then rerun using planned
comparison tests. Variations in results when using the
planned comparisons as opposed to post hoc procedures
were noted. For the purposes of illustrating the
planned comparison procedure, two Helmert contrasts (a
type of planned comparison) were performed. Helmert
contrasts are appropriate for "equal n, uncorrelated"
contrasts. Because the contrasts are uncorrelated,
statistical significance on one Helmert contrast in no
way suggests statistical significance on the other
contrast. As Stevens (1992, p. 219) explains the
process, "To determine the unique contribution a given
contrast is making we need to partial out its
correlations with the other contrasts."

In the first contrast the average scores of the 
treatment groups (Groups 1 and 2) were compared to 
Group 3. In the second contrast, Groups 1 and 2 were 
compared. To run the correct comparisons, the
researcher renamed the grouping variable "GR" and switched the
numbers assigned to Groups 1 and 3. Without switching
the numbers, the Helmert contrast would have used Group 1
as a control group and Groups 2 and 3 as treatment
groups. It is important to note that when discussing
the first MANOVA and the discriminant analysis, the
word "Group" is used in reporting the results. Groups 1
and 2 were the treatment groups and Group 3 was the
control. When discussing the contrast procedure, the
variable name "GR" is used for the grouping variable,
indicating that GR 1 was the control group and GR 2 and
3 were the two treatment groups. As indicated in Table
2, the outcome variable values did not change; only the
numerical designation of the groups changed.
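
The two contrasts can be written out as coefficient vectors. The
sketch below is illustrative rather than a reproduction of the SPSS
run: it assumes the usual Helmert coding for three groups (each level
compared with the mean of the levels that follow it) and applies the
coefficients to the cell means reported in Table 3, reordered so that
the control group is GR 1. The variable and function names are
hypothetical.

```python
# Helmert coefficients for three equal-n groups in the recoded order
# GR 1 (control), GR 2 (treatment), GR 3 (treatment).
HELMERT = [
    [1.0, -0.5, -0.5],  # contrast 1: control vs. mean of the treatments
    [0.0, 1.0, -1.0],   # contrast 2: one treatment vs. the other
]


def dot(a, b):
    """Sum of element-wise products of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def contrast_value(coefficients, means):
    """Estimated contrast: coefficient-weighted sum of group means."""
    return dot(coefficients, means)


# Table 3 means, reordered: original Group 3 -> GR 1,
# original Group 2 -> GR 2, original Group 1 -> GR 3.
concern_means = [82.625, 77.750, 77.125]
naep_means = [44.000, 35.500, 43.625]

# Each contrast sums to zero, and with equal n a zero cross-product
# of the coefficients means the two contrasts are orthogonal.
sums_to_zero = all(abs(sum(row)) < 1e-12 for row in HELMERT)
orthogonal = abs(dot(HELMERT[0], HELMERT[1])) < 1e-12
print(sums_to_zero, orthogonal)  # True True

for label, means in (("CONCERN", concern_means), ("NAEP", naep_means)):
    print(label, [contrast_value(c, means) for c in HELMERT])
```

Without the renumbering, the first row would instead compare original
Group 1 with the mean of Groups 2 and 3, which is the problem the
recoding avoids.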



Insert Table 2 
about here 



RESULTS 

The traditional MANOVA, post hoc discriminant
analysis, and MANOVA with planned comparisons were run
in SPSS. Table 3 reports the mean scores and standard
deviations for all three groups on both dependent
variables.



Insert Table 3 
about here 



Groups 1 and 2 answered the questions on the
Concern Questionnaire in a similar manner, as evidenced
by their mean scores (77.125 and 77.750,
respectively). Group 3's mean score was higher
(82.625). Group 2 had the largest variability about the
mean (SD = 14.089).

On the NAEP measure, Group 3 had the highest mean
score (44.000), followed closely by Group 1 (43.625).
Group 2 had the lowest mean score (35.500) and, again,
the highest variability about the mean (SD = 7.309).

Results of Traditional MANOVA with Follow-up 
Discriminant Analysis 

The results of the MANOVA indicate that there is
no statistically significant difference among the
means of Groups 1, 2, and 3 (Wilks' lambda = .63558),
although the p value of the calculated F (.054) is only
slightly larger than the alpha level of .05. Table 4
displays the results for several classic MANOVA tests 
of statistical significance. 
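
As a check on the reported statistics, Wilks' lambda can be converted
to its F approximation by hand. The sketch below uses the standard
transformation for the special case of two dependent variables, where
the F is exact; the function name is hypothetical, and the inputs
(lambda = .63558, 24 teachers, 3 groups) come from the text and Table
4.

```python
import math


def wilks_to_f_two_dvs(wilks_lambda, n_total, n_groups):
    """F transformation of Wilks' lambda when there are exactly two
    dependent variables. Returns (F, hypothesis df, error df)."""
    root = math.sqrt(wilks_lambda)
    df_hypothesis = n_groups - 1      # between-groups df
    df_error = n_total - n_groups     # within-groups df
    f_value = ((1.0 - root) / root) * ((df_error - 1) / df_hypothesis)
    return f_value, 2 * df_hypothesis, 2 * (df_error - 1)


f_value, df1, df2 = wilks_to_f_two_dvs(0.63558, n_total=24, n_groups=3)
# Agrees with the Wilks' row of Table 4 (F = 2.54343, df = 4 and 40)
# up to rounding of the reported lambda.
print(round(f_value, 3), df1, df2)
```

Turning this F into the reported p value of .054 requires the F
distribution with 4 and 40 degrees of freedom, which is left to a
statistics package.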



Insert Table 4 
about here 



Since there was no statistically significant
difference between groups, a post hoc test would not be
necessary; however, the researcher ran a discriminant
analysis on the data to determine where the differences
were among the groups. Those differences are
understood by examining the discriminant functions
(Heausler, 1987). Table 5 reports the standardized canonical
discriminant function coefficients.



Insert Table 5 
about here 



The Wilks' lambda indicates that 36.44% of the
variance is explained by both functions and that 4.35%
of the variance in the groups is explained by Function
II. Statistical significance was not found, even
before Function I (p = 0.0542) was extracted.
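
The two percentages above can be recovered from the statistics in
Table 4. The sketch below is a consistency check under one stated
assumption: that the Roy's value reported there (.33551) is the
squared canonical correlation of Function I. With two discriminant
functions, Wilks' lambda is the product of (1 - r^2) over the
functions, so the squared canonical correlation of Function II
follows directly. Names are illustrative.

```python
WILKS = 0.63558  # Wilks' lambda, Table 4
ROYS = 0.33551   # assumed: squared canonical correlation, Function I

# Variance explained by both functions together: 1 - lambda.
explained_both = 1.0 - WILKS

# lambda = (1 - r1^2) * (1 - r2^2), solved for r2^2.
r1_sq = ROYS
r2_sq = 1.0 - WILKS / (1.0 - r1_sq)

# The remaining Table 4 statistics are functions of the same values.
pillai = r1_sq + r2_sq
hotelling = r1_sq / (1.0 - r1_sq) + r2_sq / (1.0 - r2_sq)

print(round(100 * explained_both, 2))  # 36.44
print(round(100 * r2_sq, 2))           # 4.35
print(round(pillai, 3), round(hotelling, 3))
```

The derived Pillai and Hotelling values agree with Table 4 (.37902
and .55041) to about four decimal places, which supports the
assumption about how the Roy's value is scaled.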

Heausler (1987) emphasizes the importance of 
examining the structure coefficient matrix. The 
structure coefficients are the correlations of each of 
the dependent variables with each set of discriminating 
function scores. Table 6 displays the structure
matrix.



Insert Table 6
about here



The two dependent variables have structure coefficients
whose absolute values are dissimilar on Functions I and II. As
indicated in Table 6, NAEP scores contributed most
highly (structure coefficient = .998) to Function I,
while scores on the Concern Questionnaire contributed
most highly (structure coefficient = .988) to Function
II. Both dependent variables contributed to the
separation along both dimensions, with the dependent
variables having opposite effects on Function II.

The group centroids (average discriminant scores 
of the subjects in a given group for a given function) 
are presented in Table 7. The centroids for Function I 
indicate that all three groups are somewhat different. 
Notice that Group 3 has the largest positive centroid. 
As stated earlier, the NAEP has a very high, positive 
structure coefficient on Function I. This would lead 
the researcher to expect somewhat higher scores on the 
NAEP for the people in Group 3. These results are 
consistent with the cell means reported in Table 3. 
For Function II, Group 3 has the largest positive 
centroid. The Concern Questionnaire has a very high, 
positive structure coefficient on Function II. This 
would lead the researcher to expect higher scores on 
the Concern Questionnaire for the people in Group 3. 
Again, the results shown in Table 3 indicate a higher 
mean score on the Concern Questionnaire for Group 3. 



Insert Table 7 
about here 



Results of MANOVA Using Planned Comparisons 
As previously indicated, Helmert contrasts were
utilized to perform the planned comparison tests.
Table 9 displays the results of the first Helmert
contrast and Table 8 reports the results of the second.
The results of the first Helmert contrast indicate that
the control group does not differ to a statistically
significant degree from the average of the two
treatment groups on the set of two variables (p > .05).
However, the results of the second Helmert contrast
indicate that the two treatment groups differ to a
statistically significant degree (p < .05) on the set
of two outcome variables.
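
Because each Helmert contrast has a single hypothesis degree of
freedom, its Wilks' lambda converts exactly to an F statistic. The
sketch below applies the standard single-degree-of-freedom
transformation to the lambda values reported for the two contrasts
(.84422 and .73212); the function name is hypothetical and the code
is a check on the tabled values, not the SPSS procedure itself.

```python
def contrast_f_from_wilks(wilks_lambda, n_total, n_groups, n_dvs):
    """Exact F for a one-df multivariate contrast.
    Returns (F, hypothesis df, error df)."""
    df_within = n_total - n_groups      # 24 - 3 = 21
    df_error = df_within - n_dvs + 1    # 21 - 2 + 1 = 20
    f_value = ((1.0 - wilks_lambda) / wilks_lambda) * (df_error / n_dvs)
    return f_value, n_dvs, df_error


# First contrast: control vs. the average of the treatment groups.
f_first, df1, df2 = contrast_f_from_wilks(0.84422, 24, 3, 2)
# Second contrast: one treatment group vs. the other.
f_second, _, _ = contrast_f_from_wilks(0.73212, 24, 3, 2)

print(round(f_first, 3), round(f_second, 3), df1, df2)
```

Both values match the tabled F statistics (1.84520 and 3.65890, each
with 2 and 20 degrees of freedom) up to rounding of lambda. The
identical F across the Pillai, Hotelling, and Wilks rows in those
tables is expected, since the criteria coincide when the hypothesis
has one degree of freedom.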



Insert Tables 8 and 9 
about here 



DISCUSSION 

Based on the results of the overall MANOVA, the 
omnibus test was not statistically significant. 
Therefore the null hypothesis of equality among all 
means was not rejected. In traditional practice,
post hoc tests would not be conducted following a
result that was not statistically significant; however, for the
purposes of comparing these initial results with the
planned comparison tests to follow, a post hoc
discriminant analysis was conducted. The results of the
discriminant analysis added more distinguishing
information about the groups, but also indicated no
statistically significant differences. In performing
the two Helmert contrasts, the results changed for one
of the contrasts. Statistical significance was found in
the difference between the treatment groups, but not in
the contrast between the control group and the
combination of the treatment groups.

If statistical significance were used as the
criterion for meaningful results, the researcher would
draw different conclusions if a post hoc test were used
rather than a planned comparison test. Thus,
"planned" or "a priori" tests have important advantages
over "unplanned" or "post hoc" tests. These procedures
provide greater statistical power against Type II
error and help locate specific sources of variance.
Planned comparisons also force the researcher to think
about the relationships among groups in advance.

For example, in this study the researcher planned 
to statistically test the most logical comparisons. 
When comparing the control group to the treatment 
groups, the researcher attempted to maximize the 
opportunity to show differences between those groups 
receiving some innovation and the group receiving none. 
In the second contrast the researcher focused on the 
differences between the two innovations. A "blind" 
post hoc analysis could test all comparisons, but the 
researcher only questioned the differences between the 
presence and absence of an innovation and the 
differences among innovations. 

The results with the heuristic data support Sowell
and Casey's (1982, p. 119) thesis: "planned comparisons
are the most powerful comparison tests available."
Sowell and Casey go on to state that none of the post
hoc procedures "has the power of planned comparison
tests for detecting statistical significance."
Consequently, as Benton (1989) and Tucker (1991)
suggest, researchers and the research effort would
benefit from more frequent use of planned comparison
tests.
TABLE 2: CORRECTED PRESENTATION OF DATA FOR CONTRASTS

TABLE 3: MEANS FOR CONCERN QUESTIONNAIRE AND NAEP

VARIABLE   GROUP   MEAN      SD
CONCERN    1       77.125    10.006
           2       77.750    14.089
           3       82.625     8.314
NAEP       1       43.625     5.902
           2       35.500     7.309
           3       44.000     4.071

TABLE 4: MULTIVARIATE TESTS OF SIGNIFICANCE FOR
EFFECT GROUP

Test Name     Value    Approx. F   Hypoth. DF   Error DF   Sig. of F
Pillai's      .37902   2.45514     4.0          42.00      .060
Hotelling's   .55041   2.61445     4.0          38.00      .050
Wilks'        .63558   2.54343     4.0          40.00      .054
Roy's         .33551

TABLE 5: STANDARDIZED CANONICAL DISCRIMINANT FUNCTION
COEFFICIENTS

           FUNCTION 1   FUNCTION 2
CONCERN    0.06624       1.00173
NAEP       0.99197      -0.15445
TABLE 6: STRUCTURE MATRIX

           FUNC 1      FUNC 2
NAEP       0.99782*   -0.06599
CONCERN    0.15384     0.98810*



TABLE 7: CANONICAL DISCRIMINANT FUNCTIONS EVALUATED AT
GROUP MEANS (GROUP CENTROIDS)

GROUP    FUNC 1      FUNC 2
1         0.42127    -0.25222
2        -0.93838     0.01661
3         0.51711     0.23561



TABLE 8: MULTIVARIATE TESTS OF SIGNIFICANCE FOR HELMERT
CONTRAST 2

Test Name     Value    Approx. F   Hypoth. DF   Error DF   Sig. of F
Pillai's      .26788   3.65890     2.0          20.00      .044
Hotelling's   .36589   3.65890     2.0          20.00      .044
Wilks'        .73212   3.65890     2.0          20.00      .044
Roy's         .26788
TABLE 9: MULTIVARIATE TESTS OF SIGNIFICANCE FOR HELMERT
CONTRAST 1

Test Name     Value    Approx. F   Hypoth. DF   Error DF   Sig. of F
Pillai's      .15578   1.84520     2.0          20.00      .184
Hotelling's   .18452   1.84520     2.0          20.00      .184
Wilks'        .84422   1.84520     2.0          20.00      .184
Roy's         .15578
Appendix A

CONCERNS
QUESTIONNAIRE

Response scale anchors for each item: NOT TRUE OF ME NOW,
SOMEWHAT TRUE OF ME NOW, VERY TRUE OF ME NOW

1. I have very limited knowledge about the innovation.

2. I am concerned about not having enough time to
organize myself each day.

3. I now know of some other approaches that might
work better.

4. I would like to help other faculty in their use of the
innovation.

5. I am concerned about how the innovation affects
students.

6. I am not concerned about this innovation.

7. I would like to know who will make the decisions
about the innovation.

8. I would like to know what resources are available
if we decide to adopt this innovation.

9. I am concerned about my inability to manage
all the innovation requirements.

10. I am concerned about evaluating the impact
of the innovation on students.

11. I would like to revise the innovation's instructional
approach.

12. I am completely occupied with other things.

13. I would like to excite my students about their part
in this approach.

14. I would like to know what the use of the innovation
will require in the immediate future.

15. I would like to coordinate my effort with others to
maximize the innovation's effects.

16. I would like to have more information on time and
energy commitments required by the innovation.